Logic and Physical Synthesis Methodology for High Performance VLIW/SIMD DSP Core

Authors

  • Jagesh Sanghavi
  • Helene Deng
  • Tony Lu
Abstract

We describe a logic and physical synthesis methodology to achieve timing closure on a high-end VLIW/SIMD DSP processor core. The design comprises approximately 200,000 placeable instances. The target frequency was 250 MHz in 130 nm technology. The VLIW/SIMD DSP is described using the TIE (Tensilica Instruction Extension) language, a Verilog-like language for describing Instruction Set Architecture (ISA) extensions. The synthesizable Verilog (or VHDL) is generated by the TIE Compiler, which automatically generates complex hardware structures such as multi-ported register files and pipeline management logic. This empowers a designer to easily add custom instructions to suit the target application. However, it also leads to physical design challenges. For example, the DSP core has a 7-stage pipeline, a 160-bit VLIW datapath comprising ALU, MAC, and SHIFT units, and a 160-bit wide multi-ported register file with 16 entries. This leads to complex bypassing logic, which results in a highly congested design. The traditional flow of synthesis followed by placement and routing led to a serious timing convergence issue. The use of Physical Compiler helped close the timing gap. In particular, the best results were obtained using timing-driven congestion with a high map effort compile for the gates-to-placed-gates flow. One of the major challenges is the turnaround time, which can be anywhere from a few days to a week depending on the speed of the compute servers. In this paper, we describe the gates-to-placed-gates based Physical Compiler methodology that enabled our team to achieve the target frequency.
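
As a rough illustration of the figures quoted in the abstract, the sketch below converts the 250 MHz target into a clock period and computes the raw storage of a 160-bit, 16-entry register file. The function names are hypothetical and chosen only for this example; nothing here is taken from the authors' flow.

```python
# Back-of-the-envelope arithmetic for the numbers in the abstract.
# Function names are illustrative assumptions, not part of the paper.

def clock_period_ns(freq_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency in MHz."""
    return 1_000.0 / freq_mhz


def register_file_bits(width_bits: int, entries: int) -> int:
    """Return the raw storage of a register file in bits."""
    return width_bits * entries


if __name__ == "__main__":
    # 250 MHz target frequency from the abstract.
    print(clock_period_ns(250))         # 4.0
    # 160-bit wide register file with 16 entries from the abstract.
    print(register_file_bits(160, 16))  # 2560
```

Under these figures, every register-to-register path in the design, including the bypass paths, has to fit within a 4 ns cycle, less whatever margin the flow reserves for clock skew and setup time.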

Similar resources

NeuroMatrix® NM6403 DSP with Vector/Matrix engine

The paper describes the architecture of the NeuroMatrix® NM6403 DSP designed for image processing, signal processing and neural network emulation [1,2]. The paper includes a brief description of the processor structure and its instruction set. The NM6403 is the first DSP based on the NeuroMatrix® Core (NMC), which comprises an original 32-bit VLIW RISC processor and a 64-bit SIMD Vector co-processor (VCP...

Automatic instruction-set architecture synthesis for VLIW processor cores in the ASAM project

The design of high-performance application-specific multi-core processor systems is still a time-consuming task that involves many manual steps and decisions that need to be performed by experienced design engineers. The ASAM project sought to change this by proposing an automatic architecture synthesis and mapping flow aimed at the design of such application specific instruction-set processor...

Real-Time Multimedia Workshop on Embedded Systems for Real-Time Multimedia

This paper discusses the basic differences between horizontal and vertical vector architectures and the advantages and disadvantages of each approach for DSP applications. In addition, a new vertical DSP architecture targeted for embedded DSP applications is described in detail. The advantages of hard cores as compared with soft cores will also be discussed. DSP Landsca...

Evaluating Signal Processing and Multimedia Applications on SIMD, VLIW and Superscalar Architectures

This paper aims to provide a quantitative understanding of the performance of DSP and multimedia applications on very long instruction word (VLIW), single instruction multiple data (SIMD), and superscalar processors. We evaluate the performance of the VLIW paradigm using Texas Instruments Inc.’s TMS320C62xx processor and the SIMD paradigm using Intel’s Pentium II processor (with MMX) on a set o...

Evaluating VLIW and SIMD Architectures for DSP and Multimedia Applications

Digital signal processing (DSP) and multimedia applications are expected to be the dominant workloads on future computer systems. In this paper, we evaluate the performance of a very long instruction word (VLIW) processor using Texas Instruments Inc.’s TMS320C6x and a single-instruction multiple-data (SIMD) processor using Intel’s Pentium II processor (with MMX) on a set of benchmarks. Our benc...

Publication date: 2003